Use the training `end_e` as the `evaluation(..., epsilon=end_e)` for atari #430

pseudo-rnd-thoughts · 2023-11-15T14:13:25Z

Description

Bug fix for #429
I could repeat this for all DQN and C51 agents that have an `end_e` argument, to prevent this issue in the future.
A potential alternative is to add a new parameter for the evaluation epsilon.
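
To make the two options concrete, here is a minimal, self-contained sketch. It is not the code changed in this PR: `Args`, `eval_epsilon`, `resolve_eval_epsilon`, and `epsilon_greedy` are hypothetical names used only for illustration, and the sketch assumes the evaluation policy is plain epsilon-greedy over Q-values.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Args:
    # Hypothetical stand-in for a training script's arguments.
    start_e: float = 1.0
    end_e: float = 0.01
    # Hypothetical new parameter (the alternative mentioned above).
    # None means "reuse the training end_e".
    eval_epsilon: Optional[float] = None


def resolve_eval_epsilon(args: Args) -> float:
    """Return the epsilon to pass to evaluation.

    Falls back to the training end_e unless a dedicated
    evaluation epsilon was supplied.
    """
    return args.end_e if args.eval_epsilon is None else args.eval_epsilon


def epsilon_greedy(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Pick a uniformly random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    args = Args(end_e=0.01)
    eps = resolve_eval_epsilon(args)  # same epsilon as the end of training
    action = epsilon_greedy([0.1, 0.9, 0.3], eps, random.Random(0))
    print(f"evaluation epsilon={eps}, action={action}")
```

With this shape, the fix in this PR corresponds to always passing `args.end_e` through to `epsilon`, while the alternative adds one optional argument and leaves the default behaviour unchanged.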

Types of changes

Bug fix
New feature
New algorithm
Documentation

Checklist:

I've read the CONTRIBUTION guide (required).
I have ensured pre-commit run --all-files passes (required).
I have updated the tests accordingly (if applicable).
I have updated the documentation and previewed the changes via mkdocs serve.
- I have explained note-worthy implementation details.
- I have explained the logged metrics.
- I have added links to the original paper and related papers.

If you need to run benchmark experiments for performance-impacting changes:

I have contacted @vwxyzjn to obtain access to the openrlbenchmark W&B team.
I have used the benchmark utility to submit the tracked experiments to the openrlbenchmark/cleanrl W&B project, optionally with --capture-video.
I have performed RLops with python -m openrlbenchmark.rlops.
- For new feature or bug fix:
  - I have used the RLops utility to understand the performance impact of the changes and confirmed there is no regression.
- For new algorithm:
  - I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
- I have added the learning curves generated by the python -m openrlbenchmark.rlops utility to the documentation.
- I have added links to the tracked experiments in W&B, generated by python -m openrlbenchmark.rlops ....your_args... --report, to the documentation.

vercel · 2023-11-15T14:13:29Z

The latest updates on your projects.

Name	Status	Preview	Comments	Updated (UTC)
cleanrl	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Dec 6, 2023 0:30am

pseudo-rnd-thoughts · 2023-12-06T10:34:26Z

@vwxyzjn Do you want to rerun all of the scripts? Or, since the final evaluation data is not commonly used, can this just be merged without rerunning them?

Use the training end_e as the evaluation(..., epsilon=end_e)

93d2c8b

vercel bot deployed to Preview November 15, 2023 14:13 View deployment

Merge branch 'master' into atari-evaluation-epsilon

71eb95f

vercel bot deployed to Preview December 6, 2023 00:30 View deployment

pseudo-rnd-thoughts requested a review from vwxyzjn December 6, 2023 10:33
